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Abstract 

Background: Currently, data about age-phenotype associations are not systematically organized and cannot be 
studied methodically. Searching for scientific articles describing phenotypic changes reported as occurring at a 
given age is not possible for most ages. 

Results: Here we present the Age-Phenome Knowledge-base (APK), in which knowledge about age-related 
phenotypic patterns and events can be modeled and stored for retrieval. The APK contains evidence connecting 
specific ages or age groups with phenotypes, such as disease and clinical traits. Using a simple text mining tool 
developed for this purpose, we extracted instances of age-phenotype associations from journal abstracts related to 
non-insulin-dependent Diabetes Mellitus. In addition, links between age and phenotype were extracted from 
clinical data obtained from the NHANES III survey. The knowledge stored in the APK is made available for the 
relevant research community in the form of Age-Cards', each card holds the collection of all the information 
stored in the APK about a particular age. These Age-Cards are presented in a wiki, allowing community review, 
amendment and contribution of additional information. In addition to the wiki interaction, complex searches can 
also be conducted which require the user to have some knowledge of database query construction. 

Conclusions: The combination of a knowledge model based repository with community participation in the 
evolution and refinement of the knowledge-base makes the APK a useful and valuable environment for collecting 
and curating existing knowledge of the connections between age and phenotypes. 



Background 

The relationship between age and human health has been 
extensively investigated over the years and age has been 
linked to a plethora of so-called age-related diseases [1]. 
A patient's age may effect the course and progression of 
a disease [2,3] or may be an important factor in deter- 
mining the right course of treatment [4]. As a result of 
these endeavors, a significant quantity of data exists link- 
ing specific ages or age ranges with disease, as well as 
with other clinical phenotypes such as 'normal' parameter 
values from blood tests. 

Currently, data relating to age-phenotype associations 
are not systematically organized and cannot be studied 
methodically. For example, searching for scientific articles 
describing the phenotypic changes reported to occur at a 
given age is not possible for most ages. Consider for exam- 
ple, the search for data on the change in serum hemoglo- 
bin levels in post-partum women aged 30. It is fairly 
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straightforward to discover works that discuss postpartum 
hemoglobin level but simply searching for works associat- 
ing hemoglobin with the term 'age 30' yields a huge 
volume of work that mentions either the age of the indivi- 
dual or the average age in the cases that were studied. It is 
then necessary to find, by reading through the papers or 
abstracts, that subset of these works that discuss some 
events or trends relating to a particular age. For age- 
related phenotypic changes to be useful for medical 
research, not only are interactive searches much needed, 
but the information retrieved should be amenable to com- 
putational analysis. One approach to achieving these goals 
is by means of a knowledge-base in which age-related 
events, trends and phenotypic changes are recorded in a 
structured knowledge representation model. With such a 
knowledge-base, it should be possible to support dynamic 
queries about events known to occur at specific ages or 
within age ranges. Moreover, such a knowledge-base will 
allow mining of existing knowledge for novel insights into 
age-related processes. 

Over the years, many efforts have been made to repre- 
sent biomedical knowledge and make it accessible for 
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structured querying and computational mining, leading 
to the emergence of many bio-medical knowledge-bases 
[5]. However, most of the knowledge-bases that are con- 
cerned with human phenotypes do not represent the 
connection of these phenotypes to age. One such knowl- 
edge-base is the Human Phenotype Ontology (HPO) [6], 
which provides a standardized vocabulary of phenotypic 
abnormalities encountered in human disease. The HPO 
includes a set of terms which allow the description of 
some age-related information such as age of onset (for 
example, the term 'Onset in early adulthood') but 
detailed knowledge linking age to phenotype is not 
included. Online Mendelian Inheritance in Man 
(OMIM) [7], another major phenotype-related knowl- 
edge-base, maintains knowledge about age and the links 
to disease, yet this knowledge is not structured and 
includes only a limited range of phenotypes, such as 
genetic disorders. 

We present here the Age-Phenome Knowledge-base 
(APK), in which knowledge about age-related phenotypic 
patterns and events is modeled and stored. The knowl- 
edge-base holds a structured representation of knowl- 
edge, derived from the literature, about clinically-relevant 
traits and trends which occur at different ages such as 
disease symptoms and disease propensity. In addition, 
the database underpinning the knowledge-base can hold 
information about trends and trend changes in biomarker 
values and anthropomorphic properties. Disease and age 
are described using ontologies, allowing abstraction in 
searches (for example, searching for evidence linking 
"infectious diseases" and "children" instead of searching 
for a specified list of diseases and a range of ages as 
would be required for a relational database search). The 
knowledge-base is populated, using computational text- 
mining tools and human curation, with age-related 
knowledge as found in the medical and scientific litera- 
ture. In addition, the knowledge-base contains patterns 
extracted from raw clinical data. Earlier analysis of such 
patterns, using data sources such as the National Health 
and Environmental Nutrition Survey and the Soroka 
Maternity Database showed that non-trivial age trends 
can be extracted from raw clinical data [8,9] . By combin- 
ing these patterns with the age-related knowledge to be 
found in the research corpus, it should be possible to 
suggest new ways to consider patient age in the improve- 
ment of patient care and to propose new research 
avenues. 

Results 

Overall Structure of the APK 

The Age-Phenome Knowledge-base contains evidence 
connecting specific ages or age groups to phenotypes, 
such as clinical traits. It comprises a relational database, 
ontologies and a wild used for knowledge representation 



to the user community. The database stores all the evi- 
dence instances and the ages and phenotypes to which 
they are linked. Age and diseases are described using 
the Age Ontology and the Human Disease Ontology 
respectively. The Age Ontology is a very simple ontol- 
ogy, developed specifically for this research, that allows 
ages to be represented by grouping them into classes 
similar to those defined in the Medical Subject Headings 
(MeSH) controlled vocabulary [10]. In the Age Ontol- 
ogy, age is defined as the time which has passed since 
birth and is an attribute of a person (Figure 1). Minor 
changes were introduced to the MeSH age definitions in 
order to improve their formal logic consistency. For 
example, in MeSH the term 'Child' is defined as A per- 
son 6-12 years of age' while the term 'Preschool child' is 
defined as 'A person 2-5 years of age'. To make the 
class 'Preschool child' logically a sub-class of 'Child' in 
the Age Ontology, the definition of 'Child' had to be 
changed to A person 2-12 years of age'. Though the 
Age Ontology currently is small and has a very simple 
design, its hierarchal structure is serving the intended 
uses of the APK. In order to achieve a richer description 
of ages this ontology will be further developed in the 
future. 

The second ontology in use for the APK is the Human 
Disease Ontology (DO) [11] which organizes and 
describes diseases and was designed for use in clinical 
research and in Medicine. DO was incorporated into the 
APK to represent diseases as one form of phenotype. An 
additional resource, a collection of clinical traits not 
found in DO but occurring in the APK source docu- 
ments, allows the scope of the APK to be extended to 
cover other clinical age-related processes, such as 
changes in common clinical markers. This collection may 
be improved by adding higher-order terms that abstract 
its terms, giving a meaningful structure to the collection 
in the context of the APK. This kind of abstraction could 
be supported by importing relevant terms from the 
SNOMED-CT clinical terminology [12]. However, since 
this ontology requires licensing, its incorporation into the 
APK will be considered for a future enhancement of the 
APK. 

Each evidential text fragment stored in the knowledge- 
base is linked to the specific age (or age range) and pheno- 
type^) mentioned by it. Five different types of relation- 
ships were used to describe the links between evidence 
and age. These five types describe the most commonly 
occurring age-phenotype relationship contexts: Age of 
Onset', Age of Diagnosis', Age of Observation', Age of 
Occurrence' and Age of Evaluation'. Age of Onset' is used 
to link age to evidence describing the age of the beginning 
of manifestation of the phenotype. 'Age of Diagnosis' 
is used when the evidence describes the age at which a 
phenotype (usually a disease) is first diagnosed. Age of 
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Figure 1 Age-range classes defined by the Age Ontology. Age-range classes were generally defined based on the MeSH age range 
definitions, with minor changes introduced to improve consistency. In the Age Ontology, age-range classes that are contained within another 
age-range class are defined as subclasses of that class. Each age-range class is made disjoint from all other age-range classes since each specific 
age can belong to only one age range class. * This age range was modified from the MeSH definition. 



Observation' is used in cases where an observation is made 
at a specific age, for example a change in the phenotype 
'hemoglobin level'. 'Age of Evaluation' is used when the 
evidence describes an evaluation of a phenotype in a cer- 
tain age or age range, only indirectly asserting that these 
ages may have been selected to evaluate the phenotype for 
a reason. 'Age of Occurrence' is used to link age to evi- 
dence when the evidence describes a non-dependent 
occurrence of phenotype at a certain age but without 
asserting that the age has a direct effect on the phenotype. 

To demonstrate the approach used here for capturing 
and making medical knowledge linking age to disease avail- 
able, the APK was populated with concrete examples. 
Using a simple text mining tool developed for this purpose, 
139 instances of age-phenotype associations were extracted 
from 100 abstracts related to non-insulin-dependent Dia- 
betes Mellitus. A further 239 examples of links between 
age and phenotype were extracted from clinical data 
obtained from the NHANES III survey. The knowledge- 
base was then populated with the resulting 378 evidence 
instances, together with their mapping to the correspond- 
ing age and phenotype. A summary of the data stored in 
the APK is provided in Figure 2. During the manual popu- 
lation of the knowledge-base, it was noticed that it is some- 
times challenging even for a human reader to select the 
relationship type that best describes the type of age referred 
to in the text segments. For example, in the text "Serum 
and lipoprotein lipids were examined in 133 newly diag- 
nosed (type II) diabetic patients (70 men, 63 women), aged 
45-64 yr, and in 144 randomly selected non-diabetic control 



subjects of similar age (62 men, 82 women). The serum total 
cholesterol levels in diabetic and non-diabetic subjects were 
similar, but the HDL-cholesterol levels were lower and the 
serum total triglyceride levels higher in the diabetic than in 
non-diabetic subjects" there are two statements made 
regarding the age range 45-64 years. The first is that the 
patients investigated were 45-64 years old when diagnosed 
with Diabetes, and the second is that this age range was 
used when evaluating cholesterol and triglyceride levels in 
diabetic patients. In order to capture the age-related 
knowledge in this text, two evidence instances were 
created. The first (APKID_138) linking Diabetes to the 
ages 45-64 with the Age of Diagnosis' relationship; and the 
second (APKID_101) linking these ages to 'Diabetes', 
'serum total cholesterol', 'HDL-cholesterol levels', and 
'serum total triglyceride levels' with the Age Of Evaluation' 
relationship. Automatically resolving this kind of relation- 
ship assignment would require sophisticated natural lan- 
guage processing tools and it seems unlikely that a fully 
automatic system could be developed in the near future. It 
was therefore decided to harness the "wisdom of the 
crowd" by engaging user communities in identifying and 
correcting erroneous machine interpretations of text. This 
approach is manifested in the development of "age cards" 
that were made available to interested readers as pages in a 
wild. The wild technology was successful in engaging a 
community of relevant readers in amending and extending 
the knowledge items, with Wikipedia as the shining exam- 
ple [13]. As in Wikipedia, readers of the age-phenome Age 
Cards can add comments, propose new entries and correct 
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Figure 2 The Age-Phenome database's contents summary. A. Instances per relationship type. Only instances extracted from PubMed 
abstracts are shown. B. Instances per Age Ontology age class. An instance is assigned the most specific Age Ontology class which contains the 
whole of the age range that that instance linked to. C. The top diseases from DO linked to instances in APK. 



or delete existing entries. For the development of the APK, 
the community edits are collected from the history section 
of the Wild, and incorporated into the knowledge-base fol- 
lowing manual curation. 

Many scientific texts refer to age ranges, such as 2-7 
years, rather than one specific age. In such cases, the evi- 
dence instance is linked to the age range defined in the 
database by the minimum and the maximum values of the 
range. In some cases, the age range is not fully con- 
strained, for example "30 years or less" and there are two 
possible approaches to resolving this issue. The evidence 
instance could be linked only to the age specifically men- 
tioned, that is, 30 years in this example. In this case, some 
information will most likely be missed. The second option, 
and the one employed in the APK, is to link the evidence 
instance to all the possible ages covered by the expression, 
producing links to 0-30 for "30 years or less". Although 
this approach might add erroneous links, since the author 
might not have intended all ages under 30 years, no infor- 
mation will be missed. In future development of the APK, 
more sophisticated natural language processing will be 
used to attempt to locate and extract a more precise age 
range from the full article since this is likely to appear in 
the methods or results sections. For example, in a text dis- 
cussing "children", these sections may contain the exact 
range such as 3-6 years of age. 

Using the APK 

One way to use the APK is to extract all the information 
linked via evidence instances to a specific age or age- 
range. This can answer questions such as "what type of 
studies involve 30 year old individuals?" In the example 
shown in Figure 3A, the age '30 years' was queried and the 
phenotypes linked to this age via the 'Age of Evaluation' 
relationship are shown. 

Another way to use the APK is to locate all the speci- 
fic ages and the age-ranges in the Age Ontology that are 
linked (via evidence instances) to a specific phenotype of 



interest or to a combination of phenotypes. In the 
example shown in Figure 3B, all the ages linked to 
'Non-insulin-dependent Diabetes Mellitus' AND 'Hyper- 
tension' were extracted, with the results being filtered by 
their year of publication. From the query results, it may 
be seen that in the APK there are three evidence 
instances in which these two diseases (phenotypes) were 
evaluated and that the age-ranges used for the evalua- 
tions partially overlap. 

It must be stressed that the examples provided here are 
somewhat simplistic. More complex queries can use, for 
example, the Age Ontology and the Disease Ontology's 
hierarchal structures. It is possible, for example, to search 
for evidence linking childhood infectious diseases and 
Diabetes, using the Age Ontology to translate "child- 
hood" into an age range and to enumerate those diseases 
in the APK that are associated with the "infectious dis- 
eases" class in the DO. Thus, as the APK becomes more 
extensively populated with evidential instances, queries 
can become more complex and their results are expected 
to become more interesting and informative. 

It should be noted that, in the current implementa- 
tion, conducting complex queries in the APK requires 
some familiarity with the SQL language. 

The Age-Phenome Wiki 

The Age-Phenome Wiki was developed as a means to 
share the knowledge stored in the APK and to harness the 
knowledge and expertise of the user community in further 
developing the APK. This Wiki is constructed of 'Age 
Cards' which are generated in a semi-automatic fashion. 
For each specific age there is an Age-Card which contains 
all information linked to that age to be found in the APK. 
Each Age-Card has two versions. The first version is for 
viewing the information and uses highlighted text and col- 
ors for clearer presentation. The second version is created 
for community edits, avoiding formatting instructions 
(such as coloring) making it easy for untrained users to 
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Figure 3 Querying the APK. A. the APK was queried for phenotypes linked to age 30 years. The results show 4 (of 19) evidence instances that 
were found linked to this age via the 'Age Of Evaluation' relationship and the different phenotypes linked to these evidence instances. B. the 
APK was queried for ages linked to the phenotypes 'Non-insulin-dependent Diabetes Mellitus' and 'Hypertension'. The query was restricted for 
evidence instances with a publication year after 1990 and the age-range classes to which the resulting ages belong to were also extracted from 
the Age Ontology. The results of this query show three evidence instances linked to both phenotypes, their publication year, the ages each 
evidence instance is linked to and the age-range classes they belong to. 



edit the text with a simple editor. Edits are integrated back 
into the database, allowing the knowledge-base to grow 
and become more accurate based on readers' input. Cur- 
rently, this integration is carried out manually by curators 
who review the revision history of the cards. However, as 
the volume of editing increases, the tools that are currently 
used to read and parse the literature will be used to parse 
the edits, reducing the manual review and editing load. 

Discussion 

There is a wide consensus that age is an important fac- 
tor when considering phenotypic changes in health and 
disease. Though relationships and dependencies between 



age and phenotype can be extracted from clinical data 
and are widely discussed in medical literature, there is 
at present no system which allows methodical investiga- 
tion of the biological meaning of such relationships. 
Even the task of assembling all the information about a 
specific age is labor intensive, requiring the review of 
massive bodies of literature, and may only yield a small 
number of relevant works. 

The Age-Phenome Knowledge-base is being developed 
in order to create a knowledge representation of multiple 
types of links between age and phenotypes. These links 
are supported by various forms of evidence, such as 
excerpts from scientific publications and results of clinical 
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data analysis. Once the knowledge-base is richly populated 
with both evidence instances and with relationships which 
connect age to phenotype via the evidence instances, it 
can be used to answer questions regarding different ages 
and how they might affect phenotype (and vice versa). Sev- 
eral potential uses may be envisaged for the APK, the most 
straightforward of these being to present age-phenotype 
interrelationships in a readable and searchable form. To 
reach its critical mass of current knowledge, it seems very 
likely that the development of the APK will require a com- 
munity effort, much like other knowledge-bases [14]. To 
facilitate this effort, a wiki-structured representation of the 
knowledge-base has been implemented. Through this wiki, 
users are able to review and amend the information stored 
in the knowledge-base, flagging erroneous interpretation 
of the text or suggesting alternative or contradictory evi- 
dence. For example, if an outdated abstract connects a dis- 
ease to a specific age and this connection has since been 
disproved then the user can edit the entry, suggesting the 
correction to the curators. To facilitate the inclusion of 
user-added knowledge whilst protecting the integrity of 
the knowledge-base, user suggestions are processed 
(manually or automatically) prior to their transfer back 
into the knowledge-base. 

The principal use of the knowledge-base is to answer 
questions regarding linkages between age and phenotype. 
An example of this kind of use is the observation made in 
our lab, that post-partum hemoglobin peaks at the age of 
30. This raises the question of what other phenotypic 
changes occur at that age. With the APK, it is possible to 
query the knowledge-base and find which phenotypes are 
linked to this precise age (30), or approximate ages (29-31, 
for example) or the same age group (adults). The query 
results can be filtered by types of evidence which link the 
age and phenotype and by parameters such as publication 
dates of text evidence. Also, since phenotypes are defined 
with an ontological structure, the query could be restricted 
to phenotypes that pertain to a specific physiological sys- 
tem. The APK could also be used to assist users in experi- 
mental design by identifying the optimal age or age group 
for investigating a specific phenotype. 

The use of formal ontologies for representation of phe- 
notypes and age allows queries at any level of abstraction- 
generalization that interests the user. Phenotypes linked to 
a certain age can be viewed at different resolutions (high 
level or detailed) and ages linked to a specific phenotype 
can be viewed both as specific ages such as 30 years old or 
more general age-range classes, like adult. This approach 
can be further extended to consider unsupervised searches 
for patterns. Given a well populated database, it should be 
possible to generate promising hypotheses by detecting 
linkages between phenotypes and recurrent critical ages. 

Another use of the knowledge-base is to redefine 
meaningful age groups. Currently, age groups in the Age 



Ontology are fashioned after the definitions used in 
MeSH [10] and seem to be rather simplistic. Once the 
knowledge-base is more extensively populated with evi- 
dence instances, it should be possible to re-evaluate 
these classifications of ages. For example, if many evi- 
dence instances indicate some age-ranges that are not 
currently defined as a group, this may indicate that the 
age-range is important with respect to a specific aspect 
of Medicine or Biology and should be added to the Age 
Ontology. 

Future development of the APK 

The APK is currently at an early stage of development 
and further work is needed for the knowledge-base and 
wiki to reach their full potential. One of the goals of 
this paper is to engage potential users in contributing to 
the APK, either by editing the relevant wiki entries or 
by directing attention (also through the wiki) to specific 
areas in which age-related knowledge is likely to be 
helpful. Evaluation of user feedback will also allow the 
improvement of the Age-Phenome Wiki. 

Presently, adding knowledge to the APK requires 
manual editing by domain experts. In order for the APK 
to become widely used, it must be extended to include 
far more complete descriptions of existing knowledge. It 
is worth noting here that most of the applications of the 
APK involve inspection of the results by domain experts. 
As a result, a process that adds knowledge automatically 
to the APK, even with only moderate accuracy, could 
still be very useful. Consequently, tools are now being 
developed that will be able to extract relevant knowl- 
edge linking ages to phenotypes from the millions of 
processed PubMed abstracts. A more robust knowledge 
extraction pipeline is being developed that will mine 
abstracts listed in PubMed for relevance to any age, and 
find any of the predefined relationships to phenotype, 
such as different diseases. For this next stage, existing 
information extraction techniques and programs [15,16] 
will be used with their possible adaptation to this speci- 
fic use. 

From the experience gained in this work, it has 
become evident that it is necessary to analyze the full 
text of the relevant literature; some abstracts are 
ambiguous and some lack critical information. It is 
therefore planned to mine Pub Med Central [17] to 
seek age-phenotype relationships in whole papers. In 
the future, collaborations with relevant publishers and 
content providers may allow even more complete cov- 
erage of the literature. Advancing to analysis of the full 
text of the relevant papers will require more powerful 
text mining techniques, and will involve specific 
adjustment of the pipeline for complete text analysis, 
for example treating the methods and the results sec- 
tions differently. 
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Conclusions 

The design, development and preliminary testing of a 
knowledge-base of age-related phenotypic changes in 
humans are described. At the current level of knowledge 
represented in the APK, it is limited to being a demon- 
stration of the concept and its potential uses. Neverthe- 
less, the Age-Phenome wild may engage the research 
community and bring about rapid population of the 
knowledge-base with knowledge from specific domains 
such as Diabetes. The APK will immediately make this 
knowledge searchable in ways which are not currently 
achievable in other knowledge sources. The combination 
of a structured knowledge model along with community 



participation in its evolution, makes the APK a promis- 
ing platform for collecting existing knowledge concern- 
ing the interplay between age and phenotypes. 

Methods 

The Age-Phenome database 

The database, written in MySQL [18], comprises four 
main data tables (Figure 4): 

(i) The evidence table contains all the pieces of evi- 
dence and their description. 

(ii) The evidence-age table contains a description of 
the age information found in each evidence instance. 
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Figure 4 The Age-Phenome Database's overall structure. The Age-Phenome database comprises of 4 main data-tables: (1) Evidence, (2) 
Evidence-age, (3) Evidence-phenotype, and (4) Age-Phenotype relationships. In the Evidence table each evidence instance has a unique identifier 
and is characterized by: 'Evidence type', 'Evidence text', 'Analysis description' (for evidence computationally derived from clinical data rather than 
literature), Title', 'Publication', 'Publication type', 'Publication year', 'MeSH terms' (for evidence taken from literature), 'ID in source', 'Source name' 
and 'Source URL'. The 'Evidence text' field holds an evidential text fragment. In the evidence-age table, each entry records the age or age range 
that the evidence instance is linked to, using 'range min' and 'range max' fields to describe ranges as well as single ages (using identical range 
min and max values). In the evidence-phenotype table, values for the 'disease' and 'Additional phenotypes' fields are taken from the Disease 
Ontology, and a locally compiled additional phenotypes list (phenotypes such as different blood, available as Additional file 4), respectively. In 
the age-phenotype-relationship table, each evidence instance is linked to one of five different relationships ('Age of Onset', 'Age of Diagnosis', 
'Age of Occurrence', 'Age of Observation' and 'Age of Evaluation'). 
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(iii) The evidence-phenotype table links phenotypes 
to each evidence instance. 

(iv) The age-phenotype-relationship table is used to 
record the type of relationship linking the age and 
phenotype mentioned in each evidence instance. 

Ontologies 

The Age Ontology was developed and maintained using 
the Protege 4.0 Beta OWL editor [19], and was written 
in the OWL-DL language. The ontology is available as 
Additional file 1. The Human Disease Ontology (DO) 
[11] was downloaded (December 2009) from the Open 
Biomedical Ontologies Foundry [20] and imported into 
the knowledge-base. 

Information extraction 

As a test case, PubMed [17] was searched for articles on 
non-insulin dependent Diabetes Mellitus (Type 2 Dia- 
betes). The search was restricted to human-related arti- 
cles with all the 'ages' search options selected in the 
Advanced' search settings, as of November 2009. This 
search yielded approximately 22,000 abstracts. 

A script, implemented in Perl, used a list of regular 
expressions (available as Additional file 2) to determine 
which of the abstracts were related to age. One hundred 
of the resulting abstracts were then manually searched 
for the specific age being addressed, the type of connec- 
tion to phenotype and the specific disease. From these 
abstracts 139 instances for the knowledge-base were cre- 
ated, each describing the link of a specific age (or age 
range) to a disease or set of diseases. Basic age-related 
information contained in these abstracts was captured 
for each instance in the APK. 

Clinical data 

Data obtained from the NHANES III survey [21] was ana- 
lyzed for age related trends. For a full description of the 
analysis see the "A method for identifying critical ages 
from clinical laboratory cross-sectional data" published at: 
http://rubinlab.med.ad.bgu.ac.il/APK_clinical.html. 

An evidence instance was generated for each observa- 
tion made from the data analysis; 239 evidence instances 
were added to the knowledge-base, together with their 
connections to age and phenotype. 

The Age-Phenome Wiki 

The Age-Phenome Wiki was built using the Media Wiki 
platform [22] and is available at: http://age-phenome- 
wiki.med.ad.bgu.ac.il 

Database availability 

The APK database is available as a MySQL dump in 
Additional file 3 and a list of additional phenotypes that 



were mapped to evidence texts in APK can be found in 
Additional file 4. 

Additional material 



Additional file 1: The Age Ontology. This file contains the Age 
Ontology, developed in the OWL-DL language. 

Additional file 2: Age relating terms. This file contains a list of regular 
expressions that were used to determine which of the abstracts were 
related to age. 

Additional file 3: The APK database. This compressed folder contains 
the APK database as a MySQL dump. 

Additional file 4: List of additional phenotypes. This file contains a list 
of additional phenotypes that were mapped to evidence texts in APK. 
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